{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Copyright (c) MONAI Consortium  \n",
    "Licensed under the Apache License, Version 2.0 (the \"License\");  \n",
    "you may not use this file except in compliance with the License.  \n",
    "You may obtain a copy of the License at  \n",
    "&nbsp;&nbsp;&nbsp;&nbsp;http://www.apache.org/licenses/LICENSE-2.0  \n",
    "Unless required by applicable law or agreed to in writing, software  \n",
    "distributed under the License is distributed on an \"AS IS\" BASIS,  \n",
    "WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.  \n",
    "See the License for the specific language governing permissions and  \n",
    "limitations under the License. \n",
    "\n",
    "# MONAI Auto3DSeg Hyper-parameter Optimization with Optuna\n",
    "\n",
    "This notebook provides an example to perform hyperparameter optimization(HPO) on learning rate with grid search method for hippocampus segmentation using Optuna.\n",
    "\n",
    "Note: if you have used other Auto3DSeg notebooks on the hippocampus dataset (MSD Task 04), for example: \n",
    "- `auto_runner.ipynb`\n",
    "- `auto3dseg_autorunner_ref_api.ipynb`\n",
    "- `auto3dseg_hello_world.ipynb`\n",
    "- `hpo_nni.ipynb`\n",
    "\n",
    "You may have already generated the algorithm templates in MONAI bundle formats (hint: find them in the working directory). "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setup environment"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!python -c \"import monai\" || pip install -q \"monai-weekly[nibabel, tqdm, cucim, yaml, optuna, nni]\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setup imports"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import optuna\n",
    "import tempfile\n",
    "\n",
    "from functools import partial\n",
    "\n",
    "from monai.apps import download_and_extract\n",
    "from monai.apps.auto3dseg import BundleGen, DataAnalyzer, OptunaGen\n",
    "from monai.apps.auto3dseg.utils import export_bundle_algo_history, import_bundle_algo_history\n",
    "from monai.bundle.config_parser import ConfigParser\n",
    "from monai.config import print_config\n",
    "from monai.utils.enums import AlgoKeys\n",
    "\n",
    "print_config()"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Download dataset\n",
    "\n",
    "We provide a toy datalist file that splits a subset of the downloaded datasets into five folds.\n",
    "\n",
    "> NOTE: Each validation set only has 6 images in one fold of training.\n",
    "> Therefore, we need to set a limit on the total number of GPUs we're using in this notebook.\n",
    "> If you are using a system with more than 6 GPUs, please set the environment \"CUDA_VISIBLE_DEVICES\" to be \"0,1,2,3,4,5\"."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "directory = os.environ.get(\"MONAI_DATA_DIRECTORY\")\n",
    "if directory is not None:\n",
    "    os.makedirs(directory, exist_ok=True)\n",
    "root_dir = tempfile.mkdtemp() if directory is None else directory\n",
    "print(root_dir)\n",
    "\n",
    "msd_task = \"Task04_Hippocampus\"\n",
    "resource = \"https://msd-for-monai.s3-us-west-2.amazonaws.com/\" + msd_task + \".tar\"\n",
    "\n",
    "compressed_file = os.path.join(root_dir, msd_task + \".tar\")\n",
    "dataroot = os.path.join(root_dir, msd_task)\n",
    "if not os.path.exists(dataroot):\n",
    "    download_and_extract(resource, compressed_file, root_dir)\n",
    "\n",
    "datalist_file = os.path.join(\"..\", \"tasks\", \"msd\", msd_task, \"msd_\" + msd_task.lower() + \"_folds.json\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Define experiment file pathes"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Experiment setup\n",
    "work_dir = \"./hpo_optuna_work_dir\"\n",
    "datastats_file = os.path.join(work_dir, \"datastats.yaml\")\n",
    "if not os.path.isdir(work_dir):\n",
    "    os.makedirs(work_dir)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Prepare an input YAML\n",
    "\n",
    "This is an example for the dataset.\n",
    "User should generate their own."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "input_cfg = {\n",
    "    \"name\": msd_task,  # optional, it is only for your own record\n",
    "    \"task\": \"segmentation\",  # optional, it is only for your own record\n",
    "    \"modality\": \"MRI\",  # required\n",
    "    \"datalist\": datalist_file,  # required\n",
    "    \"dataroot\": dataroot,  # required\n",
    "}\n",
    "input = \"./input.yaml\"\n",
    "ConfigParser.export_config_file(input_cfg, input)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Create Bundle Generators\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "if not os.path.exists(datastats_file):\n",
    "    da = DataAnalyzer(datalist_file, dataroot, output_path=datastats_file)\n",
    "    da.get_all_case_stats()\n",
    "\n",
    "# algorithm generation\n",
    "bundle_generator = BundleGen(\n",
    "    algo_path=work_dir,\n",
    "    data_stats_filename=datastats_file,\n",
    "    data_src_cfg_name=input,\n",
    ")\n",
    "\n",
    "bundle_generator.generate(work_dir, num_fold=1)\n",
    "history = bundle_generator.get_history()\n",
    "export_bundle_algo_history(history)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Create Algo object from bundle_generator history\n",
    "\n",
    "Algorithm selected to do HPO.\n",
    "Refer to bundle history for the mapping between algorithm name and index.\n",
    "Users can `get_history` from `bundle_generator`, or `import_bundle_algo_history` by reading bundles saved in the working directory"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "try:\n",
    "    history = bundle_generator.get_history()\n",
    "    assert len(history) > 0\n",
    "except Exception:\n",
    "    history = import_bundle_algo_history(work_dir, only_trained=False)\n",
    "\n",
    "print(\"algorithms imported from the history:\")\n",
    "for i, algo_dict in enumerate(history):\n",
    "    print(f\"{i}: \", algo_dict[AlgoKeys.ID])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Select an algorithm to perform HPO\n",
    "Note: The name of the algorithms has a convention: `{net}_{fold_index}_{other_meta_info}`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "for algo_dict in history:\n",
    "    if algo_dict[AlgoKeys.ID].split(\"_\")[0] == \"segresnet\":\n",
    "        break\n",
    "algo_name = algo_dict[AlgoKeys.ID]\n",
    "algo = algo_dict[AlgoKeys.ALGO]\n",
    "print(f\"{algo_name} is selected. \")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## (Optional) Override params to complete training and HPO for demo purpose\n",
    "\n",
    "Users can override the parameters of choice before the HPO runs.\n",
    "In such case, users need to provide an `override_param`.\n",
    "The previously-generated algorithm will not be touched.\n",
    "`NNIGen` will create a copy of the algorithm and save the algorithm in a new folder named `{net}_{fold_index}_override`.\n",
    "\n",
    "For more information about creating overriding parameters, please refer to the section \"Override Specific Parameters in the Algorithms before HPO\" in this [tutorial documentation](../docs/hpo.md)\n",
    "\n",
    "> NOTE: The setup works fine for a machine that has GPUs less than or equal to 8.\n",
    "> The datalist in this example is only using a subset of the original dataset.\n",
    "> Users need to ensure the number of GPUs is not greater than the number that the training dataset can be partitioned.\n",
    "> For example, the following code block is not suitable for a 16-GPU system.\n",
    "> In such cases, please change the code block accordingly."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# \"override_params\" is used to update algorithm hyperparameters\n",
    "# like num_epochs, which are not in the HPO search space. We set num_epochs=2\n",
    "# to shorten the training time as an example\n",
    "\n",
    "max_epochs = 2\n",
    "\n",
    "# for segresnet2d\n",
    "override_param = {\n",
    "    \"num_epochs_per_validation\": 1,\n",
    "    \"num_epochs\": max_epochs,\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Create Optuna Generator class and overwrite get_hyperparameters() function"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "class OptunaGenLearningRate(OptunaGen):\n",
    "    def get_hyperparameters(self):\n",
    "        return {\"learning_rate\": self.trial.suggest_float(\"learning_rate\", 0.00001, 0.1)}\n",
    "\n",
    "\n",
    "optuna_gen = OptunaGenLearningRate(algo=algo, params=override_param)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Run Optuna optimization (with grid search)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "search_space = {\"learning_rate\": [0.0001, 0.001, 0.01, 0.1]}\n",
    "study = optuna.create_study(sampler=optuna.samplers.GridSampler(search_space), direction=\"maximize\")\n",
    "study.optimize(partial(optuna_gen, obj_filename=optuna_gen.get_obj_filename(), output_folder=work_dir), n_trials=2)\n",
    "print(\"Best value: {} (params: {})\\n\".format(study.best_value, study.best_params))"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.10"
  },
  "vscode": {
   "interpreter": {
    "hash": "d4d1e4263499bec80672ea0156c357c1ee493ec2b1c70f0acce89fc37c4a6abe"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
